# Week 4 Monday 7/10 brief notes.

Reading: 7.7 (Approximating integrals) and 7.8 (Improper integrals)

## Approximating integrals. (7.7)

So far we have learned various techniques to find the antiderivative of some function $f(x)$, and the hope is to express the antiderivative as some elementary function. But this is not always possible. For example, $$ \int e^{x^{2}} dx \quad\text{or}\quad \int \frac{\sin(x)}{x}dx \quad\text{or}\quad\int \sqrt{x^{3}+1}dx $$ do not possess elementary expressions, despite having antiderivatives! For instance, using the FTC, $$ g(x) = \int_{0}^{x}e^{t^{2}}dt $$ is an antiderivative of $e^{x^{2}}$!

Ok, so if we cannot write them down exactly as elementary functions, the next best thing is to **approximate them**. There are two approaches: one is numerical, and the other uses a power series expansion (which we will see later). Here we will see how the numerical approach works.

## Riemann sum approximations.

Given a function $f(x)$ that is integrable (say continuous) over the interval $[a,b]$, we can approximate the definite integral $$ \int_{a}^{b} f(x)dx $$ geometrically by using the Riemann sum expression of the integral: $$ \int_{a}^{b}f(x)dx=\lim_{n\to\infty} \sum_{i=1}^{n} f(x_{i}^{\ast})\Delta x $$ where we partition the interval $[a,b]$ into sub-intervals of length $\Delta x$, and $x_{i}^{\ast}$ is **any** point in the $i$-th sub-interval. Now, this suggests that if we were to use only finitely many pieces, say $N$ of them, then we get a reasonable approximation: $$ \int_{a}^{b} f(x)dx \approx \sum_{i=1}^{N}f(x_{i}^{\ast})\Delta x $$

Here we take $\Delta x = \frac{b-a}{N}$ (we equally partition the interval $[a,b]$ into $N$ pieces), and $x_{i}^{\ast}$ is any sample point in the $i$-th sub-interval. Often, for simplicity's sake, we pick $x_{i}^{\ast}$ to be either the left end points or the right end points of the subintervals. This gives us the following estimates:

> Fix $N$, let $\displaystyle\Delta x = \frac{b-a}{N}$ and $x_{i} = a + i \Delta x$. This chops the interval $[a,b]$ into $N$ equal pieces, each of width $\Delta x$.
> The **left end point Riemann sum estimate** is $$ \begin{align*} L_{N} & =f(x_{0})\Delta x + f(x_{1})\Delta x + \cdots + f(x_{N-1})\Delta x \\ & = \sum_{i=1}^{N} f(x_{i-1})\Delta x \end{align*} $$ And the **right end point Riemann sum estimate** is $$ \begin{align*} R_{N} & =f(x_{1})\Delta x + f(x_{2})\Delta x +\cdots + f(x_{N})\Delta x \\ & = \sum_{i=1}^{N}f(x_{i})\Delta x \end{align*} $$

Here we denote $L_{N}$ to be the $N$ piece left end point Riemann estimate, and $R_{N}$ to be the $N$ piece right end point Riemann estimate for $\int_{a}^{b}f(x)dx$. Geometrically, they look like the following:

!!! INSERT IMAGE HERE !!!

Ok, so we have the true definite integral $\displaystyle\int_{a}^{b} f(x)dx \approx L_{N}$ and $\displaystyle\int_{a}^{b}f(x)dx\approx R_{N}$. But how good are these approximations? Intuitively, the larger $N$ is, the better the approximation. We can be even more quantitatively precise than this:

> **Theorem. Error bounds for left or right end point Riemann sum estimates.**
> If $f$ is differentiable, then $$ \text{error}=\left|\int_{a}^{b}f(x)dx-L_{N}\right| \le K_{1} \frac{(b-a)^{2}}{2N} $$ and $$ \text{error}=\left|\int_{a}^{b}f(x)dx-R_{N}\right| \le K_{1} \frac{(b-a)^{2}}{2N} $$ where $K_{1}$ is **any** upper bound of $|f'(x)|$ on $[a,b]$.

In other words, the true value of the integral $\int_{a}^{b}f(x)dx$ is within $L_{N}\pm\text{error}$ if we use the $N$ piece left end point Riemann estimate, and within $R_{N}\pm\text{error}$ if we use the $N$ piece right end point Riemann sum estimate. (The proof of this is via the **mean value theorem**.)
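To make these formulas concrete, here is a minimal Python sketch of the two estimates (the code and the helper names `left_riemann`, `right_riemann` are my own illustration, not from the text):

```python
# Left and right end point Riemann sum estimates L_N and R_N.
import math

def left_riemann(f, a, b, N):
    """L_N: sample f at the left end points x_0, ..., x_{N-1} of the N pieces."""
    dx = (b - a) / N
    return sum(f(a + i * dx) for i in range(N)) * dx

def right_riemann(f, a, b, N):
    """R_N: sample f at the right end points x_1, ..., x_N of the N pieces."""
    dx = (b - a) / N
    return sum(f(a + i * dx) for i in range(1, N + 1)) * dx

# Example: estimate g(1), where g(x) = ∫_0^x e^{t^2} dt has no elementary form.
f = lambda t: math.exp(t * t)
print(left_riemann(f, 0, 1, 1000), right_riemann(f, 0, 1, 1000))
```

The only difference between the two is which end point of each piece gets sampled.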
**Example.** Let us consider the definite integral $\displaystyle\int_{1}^{2} \frac{\sin(x)}{x}dx$. As you may have guessed, there is no nice closed form for this. Let us say we use a right end point Riemann sum estimate with $N=10$. Then $\Delta x = \frac{2-1}{10}=0.1$, with right end points $x_{1},x_{2},\ldots,x_{10}=1.1,1.2,\ldots,2.0$. So we get $$ \begin{align*} R_{10} & = \frac{\sin(1.1)}{1.1}\cdot0.1+ \frac{\sin(1.2)}{1.2}\cdot 0.1 + \cdots+ \frac{\sin(2.0)}{2.0}\cdot 0.1 \\ & = 0.639876926972\ldots \end{align*} $$

With a left end point Riemann sum estimate with $N=10$, what changes is that we use the left end points instead, so $x_{0},x_{1},\ldots,x_{9}=1.0,1.1,\ldots,1.9$. So we get $$ \begin{align*} L_{10} & = \frac{\sin(1.0)}{1.0}\cdot0.1+ \frac{\sin(1.1)}{1.1}\cdot 0.1 + \cdots+ \frac{\sin(1.9)}{1.9}\cdot 0.1 \\ & = 0.678559154111\ldots \end{align*} $$

Now, on the interval $[1,2]$, the function $f(x)=\frac{\sin(x)}{x}$ is decreasing (we know this from its graph). This implies $L_{10}$ is an **overestimate** of $\int_{1}^{2} \frac{\sin(x)}{x}dx$, while $R_{10}$ is an **underestimate**. (If the function is increasing instead, then this is reversed. If a function both increases and decreases on an interval, then it is not as clear.)

Ok, but how good is this estimate? And how many pieces do we need to get the error within a desired tolerance? For instance, if we want our $L_{N}$ or $R_{N}$ estimates to be within $0.001$ of the true value of $\int_{1}^{2} \frac{\sin(x)}{x}dx$, how big of an $N$ do we need?

Using the theorem, we should first find an upper bound $K_{1}$ for the absolute value of the first derivative, $|f'(x)|$, over the interval $[1,2]$. We can do this roughly (in harder analysis one would try to find as tight a bound as possible). Note for $f(x)=\frac{\sin(x)}{x}$, we have $$ f'(x) = \frac{x\cos(x)-\sin(x)}{x^{2}} $$

Now, on $[1,2]$, we have $\frac{1}{x^{2}} \le 1$. So $$ |f'(x)| = \frac{|x\cos(x)-\sin(x)|}{x^{2}} \le |x\cos(x)-\sin(x)| $$ on $[1,2]$. Then by the **triangle inequality**, which states

> **Triangle inequality.**
> For any two numbers $a,b$, we have $|a+b|\le |a|+|b|$.

we see $$ |f'(x)| \le |x\cos(x)| + |\sin(x)| = |x||\cos(x)|+|\sin(x)| $$ on $[1,2]$. Finally, as $|x| \le 2$, $|\cos(x)|\le1$, and $|\sin(x)|\le 1$ on $[1,2]$, we conclude that $$ |f'(x)| \le 2\cdot1+1=3 $$

So $K_{1}=3$ is an upper bound of $|f'(x)|$ on $[1,2]$. So by the theorem, the error satisfies $$ \text{error}\le K_{1} \frac{(2-1)^{2}}{2N} = \frac{3}{2N} $$ if we use either the left or right end point Riemann estimate with $N$ pieces. So if we want the error to be no greater than $0.001$, picking $N$ such that $$ \text{error}\le \frac{3}{2N} \le 0.001 $$ suffices. In other words, any $N \ge \frac{3}{2(0.001)} = 1500$ works. (In particular, $N=1500$ is good.)
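As a sanity check, here is a short Python sketch (illustrative, not from the text) that reproduces $R_{10}$ and $L_{10}$ above and computes the $N$ required by the error bound:

```python
# Reproduce R_10 and L_10 for the integral of sin(x)/x on [1,2],
# then compute the N needed for a 0.001 error tolerance.
import math

def f(x):
    return math.sin(x) / x

dx = 0.1
R10 = sum(f(1.0 + i * dx) for i in range(1, 11)) * dx  # right end points 1.1, ..., 2.0
L10 = sum(f(1.0 + i * dx) for i in range(0, 10)) * dx  # left end points 1.0, ..., 1.9
print(R10)  # ≈ 0.639876926972 (underestimate: f is decreasing on [1,2])
print(L10)  # ≈ 0.678559154111 (overestimate)

# Error bound: error <= K_1 (b-a)^2 / (2N) with K_1 = 3, so N >= 3/(2*0.001).
K1, tol = 3, 0.001
N = math.ceil(K1 * (2 - 1) ** 2 / (2 * tol))
print(N)  # 1500
```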
Notice how this estimate depends on $N$. We can actually produce a better estimate (for a given $N$) if we know higher derivatives of the function at hand. Let's see some other estimation methods.

## Midpoint rule.

As the name suggests, this method of estimation takes the midpoint of each subinterval (instead of the left or right end points). Let us denote $M_{N}$ to be the $N$ piece midpoint estimate for $\int_{a}^{b}f(x)dx$. This is given by the following:

> Fix $N$, and let $\Delta x = \frac{b-a}{N}$ and $x_{i}=a+i\Delta x$, for $i=0,1,2,\ldots,N$.
> The **midpoint rule estimate** for $f(x)$ on $[a,b]$ with $N$ pieces is given by $$ \begin{align*} M_{N} & = f\left( \frac{x_{0}+x_{1}}{2}\right)\Delta x + f\left( \frac{x_{1}+x_{2}}{2} \right) \Delta x+\cdots + f\left( \frac{x_{N-1}+x_{N}}{2} \right)\Delta x \\ & = \sum_{i=1}^{N} f\left( \frac{x_{i-1}+x_{i}}{2} \right) \Delta x \end{align*} $$

!!! INSERT IMAGE HERE !!!

How good is this estimate?

> **Theorem. Error bound for midpoint rule.**
> If $f$ is twice differentiable, then $$ \text{error} = \left| \int_{a}^{b}f(x)dx-M_{N}\right| \le K_{2}\frac{(b-a)^{3}}{24N^{2}} $$ where $K_{2}$ is any upper bound of $|f''(x)|$ on the interval $[a,b]$.

In other words, using the $N$ piece midpoint rule estimate, the true value of the integral $\int_{a}^{b}f(x)dx$ is within $M_{N}\pm\text{error}$.

**Example.** Let us illustrate by using the midpoint rule with $N=4$ to estimate $\int_{0}^{1}e^{x^{2}}dx$. Here the interval $[0,1]$ is divided into $4$ equal pieces with $\Delta x = 0.25$, and the corresponding midpoints are $0.125, 0.375, 0.625, 0.875$. Whence $$ M_{4}= e^{0.125^{2}} (0.25) + e^{0.375^{2}} (0.25)+ e^{0.625^{2}} (0.25)+ e^{0.875^{2}} (0.25) \approx 1.44875\ldots $$

How good is this? To determine the error, we need an upper bound of $|f''(x)|$ on $[0,1]$, where $f(x)=e^{x^{2}}$. Note $f'(x)=e^{x^{2}}(2x)$ and $f''(x)=e^{x^{2}}(4x^{2})+e^{x^{2}}(2)$. Let us estimate this. Since $x^{2}$ and $e^{x^{2}}$ are increasing, both terms are largest when $x=1$. So on $[0,1]$ we have $$ \begin{align*} |f''(x)| & =|e^{x^{2}}(4x^{2})+e^{x^{2}}(2)| \\ &= |e^{x^{2}}(4x^{2})|+|e^{x^{2}}(2)| \\ & \le 4e + 2e = 6e \end{align*} $$ Hence $6e$ is an upper bound of $|f''(x)|$ on $[0,1]$. So take $K_{2} = 6e$. Now applying the error bound theorem for the midpoint rule, we have $$ \text{error}\le K_{2} \frac{(b-a)^{3}}{24N^{2}} = 6e \frac{(1-0)^{3}}{24(4)^{2}} =0.042\ldots $$

So using the midpoint rule with 4 pieces, we estimate $\int_{0}^{1}e^{x^{2}}dx$ to be within $1.44875\pm 0.042$. Observe that as $N$ increases, the error gets smaller!

**Example.** Suppose we are to estimate the integral $\displaystyle\int_{0}^{1}e^{x^{2}}dx$ with the midpoint rule. How many pieces $N$ do we need to estimate it to within $0.0001$ of the true value of the integral?

$\blacktriangleright$ From the previous example, we found $K_{2}=6e$ to be an upper bound for $|f''(x)|$ when $f(x)=e^{x^{2}}$. By the error bound theorem for the midpoint rule, this means $$ \text{error} \le 6e \frac{(1-0)^{3}}{24N^{2}} = \frac{6e}{24N^{2}} $$ So if we desire the error to be $\le 0.0001$, taking $$ \frac{6e}{24N^{2}} \le 0.0001 $$ would work. In other words, take $N^{2}\ge \frac{6e}{24(0.0001)} \approx 6795.70\ldots$, that is, $N\ge 82.4\ldots$. So $N=83$ works. $\blacklozenge$
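Here is a minimal Python sketch of the midpoint rule (the helper `midpoint` is my own name, not from the text), reproducing $M_{4}$ and the $N=83$ answer:

```python
# Midpoint rule M_N, applied to the integral of e^{x^2} on [0,1].
import math

def midpoint(f, a, b, N):
    """M_N: sample f at the midpoint of each of the N equal pieces."""
    dx = (b - a) / N
    return sum(f(a + (i + 0.5) * dx) for i in range(N)) * dx

f = lambda x: math.exp(x * x)
print(midpoint(f, 0, 1, 4))  # ≈ 1.44875

# Error bound: error <= K_2 (b-a)^3 / (24 N^2) with K_2 = 6e.
K2, tol = 6 * math.e, 0.0001
N = math.ceil(math.sqrt(K2 / (24 * tol)))  # need N^2 >= 6e / (24 * 0.0001)
print(N)  # 83
```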
## Trapezoidal rule.

Another way to estimate a definite integral is to use trapezoids instead. We shall denote $T_{N}$ to be the $N$ piece trapezoidal rule estimate for $\int_{a}^{b}f(x)dx$.

> Fix $N$, and divide the interval $[a,b]$ into $N$ equal pieces. This gives us end points $x_{i}=a + i\Delta x$, with width $\Delta x = \frac{b-a}{N}$. On the $i$-th interval $[x_{i-1},x_{i}]$, draw a trapezoid with heights $f(x_{i-1})$ and $f(x_{i})$ and width $\Delta x$. This $i$-th trapezoid has area $\frac{f(x_{i-1})+f(x_{i})}{2}\Delta x$. Summing these gives the $N$ piece trapezoidal rule estimate for $\int_{a}^{b}f(x)dx$, $$ T_{N}=\sum_{i=1}^{N} \frac{f(x_{i-1})+f(x_{i})}{2}\Delta x $$

!!! INSERT IMAGE HERE !!!

How good is this estimate?

> **Theorem. Error bound for trapezoidal rule estimation.**
> If $f$ is twice differentiable, then $$ \text{error} = \left| \int_{a}^{b}f(x)dx-T_{N}\right| \le K_{2}\frac{(b-a)^{3}}{12N^{2}} $$ where $K_{2}$ is any upper bound of $|f''(x)|$ on the interval $[a,b]$.

Notice $L_{N},R_{N}$ are order $O(\frac{1}{N})$ estimates, while $M_{N}, T_{N}$ are order $O(\frac{1}{N^{2}})$ estimates. What this means is that for a large number of pieces $N$, the midpoint rule and trapezoidal rule will be better than the left or right end point Riemann estimates. Are there better methods? Mathematicians got to work.

## Simpson's rule.

In the previous methods, we estimate the area under the graph of $f(x)$ over a subinterval $[x_{i-1},x_{i}]$ using a shape whose top is a straight line segment. But in general the graph of $f(x)$ is curved. So what if we use a curve instead? The "simplest" next interesting curve is a parabola (a degree 2 polynomial). So let us estimate $\int_{a}^{b}f(x)dx$ with slivers of rectangles but with a tiny parabola on top, where each parabola goes through three points on the curve. (The general problem of finding such a curve is called the **spline problem** or **interpolation problem**. You may have heard of the **cubic Bezier curve**, which is a cubic version of this.)

Since a parabola in general has three coefficients ($Ax^{2}+Bx+C$), we need three points to determine a parabola. So this method requires an even number of intervals: using two intervals at a time gives us three points. Here is the final result (you can see the derivation in the text).

> **Simpson's rule.**
> Over the interval $[a,b]$, take an **even** number $N$. Divide the interval equally into $N$ pieces, with width $\Delta x = \frac{b-a}{N}$. The end points are $x_{i} = a+i\Delta x$. Then the $N$ piece Simpson's rule estimate for $\int_{a}^{b}f(x)dx$ is given by $$ S_{N}=\frac{\Delta x}{3}[f(x_{0})+4f(x_{1})+2f(x_{2})+4f(x_{3})+\cdots+2f(x_{N-2})+4f(x_{N-1})+f(x_{N})] $$ The pattern of the coefficients is $1,4,2,4,2,4,\ldots,2,4,2,4,1$.

!!! INSERT IMAGE HERE !!!

How good is Simpson's rule? It turns out to be an order $O(\frac{1}{N^{4}})$ method!

> **Theorem. Error bound for Simpson's rule estimation.**
> If $f$ is four-times differentiable, then $$ \text{error} = \left| \int_{a}^{b}f(x)dx-S_{N}\right| \le K_{4}\frac{(b-a)^{5}}{180N^{4}} $$ where $K_{4}$ is any upper bound of $|f^{(4)}(x)|$ on the interval $[a,b]$.

(By the way, Simpson's rule was known about 100 years before Thomas Simpson, in the 1600s, by Kepler at least. But Simpson did turn out to be a great self-taught mathematician himself in the 1700s. He was a weaver by trade; it is said that he "wove during the day and taught mathematics at night".)

Please read chapter 7.7.
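Finally, to see these convergence orders in action, here is a short Python sketch (my own illustration, not from the text) comparing the trapezoidal and Simpson's rule errors on $\int_{0}^{1}e^{x^{2}}dx$. The reference value is just a very fine midpoint estimate, since no closed form exists.

```python
# Trapezoidal and Simpson's rules, compared on the integral of e^{x^2} on [0,1].
import math

def trapezoid(f, a, b, N):
    """T_N: average the two end point samples on each of the N pieces."""
    dx = (b - a) / N
    return sum((f(a + (i - 1) * dx) + f(a + i * dx)) / 2 * dx
               for i in range(1, N + 1))

def simpson(f, a, b, N):
    """S_N: weights 1,4,2,4,...,2,4,1 times dx/3; N must be even."""
    assert N % 2 == 0, "Simpson's rule needs an even number of pieces"
    dx = (b - a) / N
    total = f(a) + f(b)
    for i in range(1, N):
        total += (4 if i % 2 == 1 else 2) * f(a + i * dx)
    return total * dx / 3

f = lambda x: math.exp(x * x)
ref = sum(f((i + 0.5) / 10**6) for i in range(10**6)) / 10**6  # fine midpoint reference

for N in (4, 8, 16):
    print(N, abs(trapezoid(f, 0, 1, N) - ref), abs(simpson(f, 0, 1, N) - ref))
# Doubling N should cut the trapezoid error by roughly 4 (order 1/N^2)
# and the Simpson error by roughly 16 (order 1/N^4).
```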